Producibility in hierarchical self-assembly

Abstract

Three results are shown on producibility in the hierarchical model of tile self-assembly.

It is shown that a simple greedy polynomial-time strategy decides whether an assembly α is
producible. The algorithm can be optimized to use O(|α| log |α|) time.

Cannon, Demaine, Demaine, Eisenstat, Patitz, Schweller, Summers, and Winslow [5] showed
that the problem of deciding if an assembly α is the unique producible terminal assembly of a
tile system 𝒯 can be solved in O(|α|²|𝒯| + |α||𝒯|²) time for the special case of noncooperative
"temperature 1" systems. It is shown that this can be improved to O(|α||𝒯| log |𝒯|) time.

Finally, it is shown that if two assemblies are producible, and if they can be overlapped
consistently - i.e., if the positions that they share have the same tile type in each assembly -
then their union is also producible.
1 Introduction

1.1 Background of the field 

Winfree's abstract Tile Assembly Model (aTAM) [30] is a model of crystal growth through coopera- 
tive binding of square-like monomers called tiles, implemented experimentally (for the current time) 
by DNA [4,32]. In particular, it models the potentially algorithmic capabilities of tiles that can 
be designed to bind if and only if the total strength of attachment (summed over all binding sites,
called glues, on the tile) is at least a parameter τ, sometimes called the temperature. For example,
when the glue strengths are integers and τ = 2, this implies that two strength 1 glues must cooperate
to bind the tile to a growing assembly. Two assumptions are key: 1) growth starts from a single
specially designated seed tile type, and 2) only individual tiles bind to an assembly, never larger 
assemblies consisting of more than one tile type. We will refer to this model as the seeded aTAM. 
While violations of these properties are often viewed as errors in implementation of the seeded 
aTAM [25,26], relaxing the assumptions results in a different model with its own programmable 
abilities. In the hierarchical (a.k.a. multiple tile [3], polyomino [20,31], two-handed [5,10,14])
aTAM, there is no seed tile, and an assembly is considered producible so long as two producible
assemblies are able to attach to each other with strength at least τ, with all individual tiles being
considered as "base case" producible assemblies. In either model, an assembly is considered terminal
if nothing can attach to it; viewing self-assembly as a computation, terminal assembly(ies) are
often interpreted to be the output. See [12,21] for an introduction to recent work in these models.

The hierarchical aTAM has attracted considerable recent attention. It is coNP-complete to 
decide whether an assembly is the unique terminal assembly produced by a hierarchical tile sys- 
tem [5]. There are infinite shapes that can be assembled in the hierarchical aTAM but not the seeded 
aTAM, and vice versa, and there are finite shapes requiring strictly more tile types to assemble 
in the seeded aTAM than the hierarchical aTAM, and vice versa [5]. Despite this incomparability 
between the models for exact assembly of shapes, with a small blowup in scale, any seeded tile 
system can be simulated by a hierarchical tile system [5], improving upon an earlier scheme that 
worked for restricted classes of seeded tile systems [20]. However, the hierarchical aTAM is not 
able to simulate itself from a single set of tile types, i.e., it is not intrinsically universal [10], unlike 
the seeded aTAM [13]. It is possible to assemble an n × n square in a hierarchical tile system with
O(log n) tile types that exhibits a very strong form of fault-tolerance in the face of spurious growth
via strength 1 bonds [14]. The parallelism of the hierarchical aTAM suggests the possibility that it 
can assemble shapes faster than the seeded aTAM, but it cannot for a wide class of tile systems [6]. 

Interesting variants of the hierarchical aTAM introduce other assumptions to the model. The 
multiple tile model retains a seed tile and places a bound on the size of assemblies attaching to 
it [3]. Under this model, it is possible to modify a seeded tile system to be self-healing, that is, 
it correctly regrows when parts of itself are removed, even if the attaching assemblies that refill 
the removed gaps are grown without the seed [31]. The model of staged assembly allows multiple 
test tubes to undergo independent growth, with excess incomplete assemblies washed away (e.g. 
purified based on size) and then mixed, with assemblies from each tube combining via hierarchical 
attachment [8,9]. The RNase enzyme model [1,11,22] assumes some tile types to be made of 
RNA, which can be digested by an enzyme called RNase, leaving only the DNA tiles remaining, 
and possibly disconnecting what was previously a single RNA/DNA assembly into multiple DNA 
assemblies that can combine via hierarchical attachment. Introducing negative glue strengths into 
the hierarchical aTAM allows for "fuel-efficient" computation [27]. 



1.2 Contributions of this paper 

We show three results related to producibility in the hierarchical aTAM. 

1. In the seeded aTAM, there is an obvious linear-time algorithm to test whether assembly α is
producible by a tile system: starting from the seed, try to attach tiles until α is complete or
no more attachments are possible. We show that in the hierarchical aTAM, a similar greedy
strategy correctly identifies whether a given assembly is producible, though it is more involved
to prove that it is correct. The idea is to start with all tiles in place as they appear in α, but
with no bonds, and then to greedily bind attachable assemblies until α is assembled. The
algorithm can be optimized to use O(|α| log |α|) time. It is not obvious that this works, since
it is conceivable that assemblies must attach in a certain order for α to form, but the greedy
strategy may pick another order and hit a dead-end in which no assemblies can attach. This
is shown in Section 3.

2. We show that there is a faster algorithm for the temperature 1 Unique Production Verification
(UPV) problem studied by Cannon, Demaine, Demaine, Eisenstat, Patitz, Schweller,
Summers, and Winslow [5], which is the problem of determining whether assembly α is the
unique producible terminal assembly of tile system 𝒯, where 𝒯 has temperature 1, meaning
that all positive glue strengths are sufficiently strong to attach any two assemblies. They give
an algorithm that runs in O(|α|²|𝒯| + |α||𝒯|²) time. We show how to improve this running
time. Cannon et al. proved their result by using an O(|α|² + |α||𝒯|) time algorithm for UPV
that works in the seeded aTAM [2], and then reduced the hierarchical temperature-1 UPV
problem to |𝒯| instances of the seeded UPV problem. We improve this result by showing that
a faster O(|α| log |𝒯|) time algorithm for the seeded UPV problem exists for the special case
of temperature 1, and then we apply the technique of Cannon et al. relating the hierarchical
problem to the seeded problem to improve the running time of the hierarchical algorithm to
O(|α||𝒯| log |𝒯|). This is shown in Section 4.

3. We show that if two assemblies α and β are producible in the hierarchical model, and if they
can be overlapped consistently (i.e., if the positions that they share have the same tile type
in each assembly), then their union α ∪ β is producible. This is trivially true in the seeded
model, but it requires more care to prove in the hierarchical model. It is conceivable a priori
that although β is producible, β must assemble α ∩ β in some order that is inconsistent with
how α assembles α ∩ β. This is shown in Section 5.

2 Informal definition of the abstract tile assembly model 

This section gives a brief informal sketch of the seeded and hierarchical variants of the abstract 
Tile Assembly Model (aTAM). See Section A for a formal definition of the aTAM. 

A tile type is a unit square with four sides, each consisting of a glue label (often represented as
a finite string) and a nonnegative integer strength. We assume a finite set T of tile types, but an
infinite number of copies of each tile type, each copy referred to as a tile. If a glue has strength
0, we say it is null, and if a positive-strength glue facing some direction does not appear on some
tile type in the opposite direction, we say it is functionally null. We assume that all tile sets in
this paper contain no functionally null glues. An assembly is a positioning of tiles on the integer
lattice ℤ²; i.e., a partial function α : ℤ² ⇢ T. We write |α| to denote |dom α|. Write α ⊑ β
to denote that α is a subassembly of β, which means that dom α ⊆ dom β and α(p) = β(p) for
all points p ∈ dom α. In this case, say that β is a superassembly of α. We abuse notation and
take a tile type t to be equivalent to the single-tile assembly containing only t (at the origin if not
otherwise specified). Two adjacent tiles in an assembly interact if the glue labels on their abutting
sides are equal and have positive strength. Each assembly induces a binding graph, a grid graph
whose vertices are tiles, with an edge between two tiles if they interact. The assembly is τ-stable if
every cut of its binding graph has strength at least τ, where the weight of an edge is the strength
of the glue it represents.
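
As a concrete illustration of these definitions, the following minimal Python sketch builds the binding graph of an assembly. The representation is an assumption made only for this example and does not come from the paper: a glue is a (label, strength) pair, a tile type is a 4-tuple of glues in the order north, east, south, west, and an assembly is a dictionary from lattice points to tile types. The names bond_strength and binding_graph are hypothetical.

```python
from typing import Dict, Tuple

Glue = Tuple[str, int]                 # (label, strength); strength 0 means a null glue
Tile = Tuple[Glue, Glue, Glue, Glue]   # glues on the north, east, south, west sides (assumed order)
Pos = Tuple[int, int]
Assembly = Dict[Pos, Tile]

# Unit offset -> (side index on this tile, index of the abutting side on the neighbor).
DIRS = {(0, 1): (0, 2), (1, 0): (1, 3), (0, -1): (2, 0), (-1, 0): (3, 1)}


def bond_strength(alpha: Assembly, p: Pos, q: Pos) -> int:
    """Strength of the bond between the tiles at adjacent positions p and q (0 if none)."""
    d = (q[0] - p[0], q[1] - p[1])
    if d not in DIRS or p not in alpha or q not in alpha:
        return 0
    i, j = DIRS[d]
    g, h = alpha[p][i], alpha[q][j]
    # Two tiles interact when the abutting glues are equal and have positive strength.
    return g[1] if g == h and g[1] > 0 else 0


def binding_graph(alpha: Assembly) -> Dict[frozenset, int]:
    """Map each interacting pair {p, q} to the strength of the glue joining them."""
    edges = {}
    for p in alpha:
        for d in ((0, 1), (1, 0)):     # look north and east so each pair is seen once
            q = (p[0] + d[0], p[1] + d[1])
            s = bond_strength(alpha, p, q)
            if s > 0:
                edges[frozenset((p, q))] = s
    return edges
```

Checking τ-stability would additionally require a minimum cut computation on this weighted graph, which the sketch deliberately omits.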

A seeded tile assembly system (seeded TAS) is a triple 𝒯 = (T, σ, τ), where T is a finite set of
tile types, σ : ℤ² ⇢ T is a finite, τ-stable seed assembly, and τ is the temperature. If 𝒯 has a
single seed tile s ∈ T (i.e., σ(0, 0) = s for some s ∈ T and is undefined elsewhere), then we write
𝒯 = (T, s, τ). Let |𝒯| denote |T|. An assembly α is producible if either α = σ or if β is a producible
assembly and α can be obtained from β by the stable binding of a single tile. In this case write
β →₁ α (α is producible from β by the attachment of one tile), and write β → α if β →₁* α (α is
producible from β by the attachment of zero or more tiles). An assembly is terminal if no tile can
be τ-stably attached to it.

A hierarchical tile assembly system (hierarchical TAS) is a pair 𝒯 = (T, τ), where T is a finite
set of tile types and τ ∈ ℕ is the temperature. An assembly is producible if either it is a single
tile from T, or it is the τ-stable result of translating two producible assemblies without overlap.
Therefore, if an assembly α is producible, then it is produced via an assembly tree, a full binary
tree whose root is labeled with α, whose |α| leaves are labeled with tile types, and each internal
node is a producible assembly formed by the stable attachment of its two child assemblies. An
assembly α is terminal if for every producible assembly β, α and β cannot be τ-stably attached. If
α can grow into β by the attachment of zero or more assemblies, then we write α → β.

Note that our definitions imply only finite assemblies are producible. In either the seeded or
hierarchical model, let A[𝒯] be the set of producible assemblies of 𝒯, and let A_□[𝒯] ⊆ A[𝒯] be the
set of producible, terminal assemblies of 𝒯. A TAS 𝒯 is directed (a.k.a., deterministic, confluent) if
|A_□[𝒯]| = 1. If 𝒯 is directed with unique producible terminal assembly α, we say that 𝒯 uniquely
produces α. It is easy to check that in the seeded aTAM, 𝒯 uniquely produces α if and only if
every producible assembly β ⊑ α. In the hierarchical model, a similar condition holds, although it
is more complex since hierarchical assemblies, unlike seeded assemblies, do not have a "canonical
translation" defined by the seed. 𝒯 uniquely produces α if and only if for every producible assembly
β, there is a translation β' of β such that β' ⊑ α. In particular, if there is a producible assembly
β ≠ α such that dom α = dom β, then α is not uniquely produced. The untranslated β is not a
subassembly of α because β ≠ α, and since dom β = dom α, every nonzero translation of β has some
tiled position outside of dom α, whence no such translation can be a subassembly of α, implying α
is not uniquely produced.

3 Polynomial-time verification of production

Let S be a finite set. A partition of S is a collection C = {C_1, ..., C_k} ⊆ P(S) such that
C_1 ∪ ... ∪ C_k = S and for all i ≠ j, C_i ∩ C_j = ∅. A hierarchical division of S is a full binary
tree T (a tree in which every internal node has exactly two children) whose nodes represent subsets
of S, such that the root of T represents S, the |S| leaves of T represent the singleton sets {x} for
each x ∈ S, and each internal node has the property that its set is the (disjoint) union of its two
children's sets.



Lemma 3.1. Let S be a finite set with |S| ≥ 2. Let T be any hierarchical division of S, and let C
be any partition of S other than {S}. Then there exist C_1, C_2 ∈ C with C_1 ≠ C_2, and there exist
C'_1 ⊆ C_1 and C'_2 ⊆ C_2, such that C'_1 and C'_2 are siblings in T.

Proof. First, label each leaf {x} of T with the unique element C_i ∈ C such that x ∈ C_i. Next,
iteratively label internal nodes according to the following rule: while there exist two children of a
node u that have the same label, assign that label to u. Notice that this rule preserves the invariant
that each labeled node u (representing a subset of S) is a subset of the set its label represents.
Continue until no node has two identically-labeled children. C contains only proper subsets of S,
so the root (which is the set S) cannot be contained in any of them, implying the root will remain
unlabeled. Follow any path starting at the root, always following an unlabeled child, until both
children of the current internal node are labeled. (The path may vacuously end at the root.) Such a
node is well-defined since at least all leaves are labeled. By the stopping condition stated previously,
these children must be labeled differently. The children are the witnesses C'_1 and C'_2, with their
labels having the values C_1 and C_2, testifying to the truth of the lemma. □
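
The labeling argument in this proof is constructive, and a minimal Python sketch of it follows. The encoding is an assumption made for illustration only: a hierarchical division is a nested pair (left, right) whose leaves are non-tuple elements of S, and the partition is a list of sets. The names leaves and find_sibling_witnesses are hypothetical, and the sketch recomputes labels on the way down instead of labeling bottom-up, so it is simple rather than efficient.

```python
def leaves(node):
    """Set of elements of S represented by a node of the hierarchical division."""
    if not isinstance(node, tuple):
        return {node}
    return leaves(node[0]) | leaves(node[1])


def find_sibling_witnesses(tree, partition):
    """Return (C1', C2', i, j) with C1' and C2' siblings in tree,
    C1' a subset of partition[i], C2' a subset of partition[j], and i != j."""
    block_of = {x: i for i, block in enumerate(partition) for x in block}

    def label(node):
        # A node is labeled with block i exactly when all its leaves lie in block i.
        blocks = {block_of[x] for x in leaves(node)}
        return next(iter(blocks)) if len(blocks) == 1 else None

    if label(tree) is not None:
        raise ValueError("the partition must differ from {S}")
    node = tree
    while True:
        left, right = node
        l_lab, r_lab = label(left), label(right)
        if l_lab is not None and r_lab is not None:
            # Both children labeled while node is unlabeled, so the labels differ.
            return leaves(left), leaves(right), l_lab, r_lab
        node = left if l_lab is None else right   # follow an unlabeled child
```

For example, with tree ((1, 2), (3, 4)) and partition [{1, 3}, {2, 4}], the walk descends into (1, 2) and returns the sibling leaves {1} and {2}.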

Lemma 3.1 will be useful when we view T as an assembly tree for some producible assembly α,
and we view C as a partially completed attempt to construct another assembly tree for α, where
each element of C is a subassembly that has been produced so far.

When we say "by monotonicity", this refers to the fact that glue strengths are nonnegative,
which implies that if two assemblies α and β can attach, the addition of more tiles to either α or
β cannot prevent this binding, so long as the additional tiles do not overlap the other assembly.

We want to solve the following problem: given an assembly α and temperature τ, is α producible
in the hierarchical aTAM at temperature τ?¹ The algorithm Is-Producible-Assembly
(Algorithm 1) solves this problem.

Algorithm 1 Is-Producible-Assembly(α, τ)
 1: input: assembly α and temperature τ
 2: C ← { {v} | v ∈ dom α }   // (positions defining) subassemblies of α; initially individual (positions of) tiles
 3: while |C| > 1 do
 4:   if there exist C_i, C_j ∈ C with glues between C_i and C_j of total strength at least τ then
 5:     C ← (C \ {C_i, C_j}) ∪ {C_i ∪ C_j}
 6:   else
 7:     print "α is not producible" and exit
 8:   end if
 9: end while
10: print "α is producible"
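
A direct, unoptimized Python sketch in the spirit of Algorithm 1 is given below. It is an illustration under assumptions that do not come from the paper: a glue is a (label, strength) pair, a tile is a 4-tuple of glues ordered north, east, south, west, an assembly is a non-empty dictionary from positions to tiles, and τ ≥ 1. The name is_producible is hypothetical. Each pass recomputes the total strength between every pair of adjacent components, so the sketch runs in polynomial time but makes no attempt at the optimized bound.

```python
def is_producible(alpha, tau):
    """Greedy producibility test: merge any two components bound with strength >= tau."""
    comp = {p: i for i, p in enumerate(alpha)}   # component id of each position
    remaining = len(alpha)
    while remaining > 1:
        # Total glue strength between each pair of distinct adjacent components.
        strength = {}
        for (x, y), t in alpha.items():
            # Look at the north and east neighbors so each adjacent pair is seen once.
            for q, i, j in (((x, y + 1), 0, 2), ((x + 1, y), 1, 3)):
                if (q in alpha and comp[(x, y)] != comp[q]
                        and t[i] == alpha[q][j] and t[i][1] > 0):
                    key = frozenset((comp[(x, y)], comp[q]))
                    strength[key] = strength.get(key, 0) + t[i][1]
        pair = next((k for k, s in strength.items() if s >= tau), None)
        if pair is None:
            return False                          # dead end: nothing can attach
        keep, drop = pair
        for p in alpha:                           # merge the two components
            if comp[p] == drop:
                comp[p] = keep
        remaining -= 1
    return True
```

On a 2 × 2 square whose two horizontal bonds have strength 2 and whose two vertical bonds have strength 1, the sketch first forms the two horizontal dimers and then joins them through the two cooperating strength 1 bonds at τ = 2.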

Theorem 3.2. There is an O(|α| log |α|) algorithm deciding whether an assembly α is producible
at temperature τ in the hierarchical aTAM.

Proof. Correctness: Is-Producible-Assembly works by building up the initially edge-free
graph with the tiles of α as its nodes (the algorithm stores the nodes as points in ℤ², but α
would be used in step 4 to get the glues and strengths between tiles at adjacent positions), stopping
when the graph becomes connected. The order in which connected components (implicitly
representing assemblies) are removed from and added to C implicitly defines a particular assembly
tree with α at the root (for every C_1, C_2 processed in line 5, the assembly α ↾ (C_1 ∪ C_2) is a parent
of α ↾ C_1 and α ↾ C_2 in the assembly tree). Therefore, if the algorithm reports that α is producible,
then it is. Conversely, suppose that α is producible via assembly tree T. Let C = {C_1, ..., C_k} be
the set of assemblies at some iteration of the loop at line 3. It suffices to show that some pair of
assemblies C_i and C_j are connected by glues with strength at least τ. Since the loop condition
guarantees |C| > 1, C is a partition of dom α other than {dom α}, so by Lemma 3.1 there exist
C_i and C_j with subsets C'_i ⊆ C_i and C'_j ⊆ C_j such that C'_i and C'_j are sibling nodes in T. Because
they are siblings, the glues between C'_i and C'_j have strength at least τ. By monotonicity these
glues suffice to bind C_i to C_j, so Is-Producible-Assembly is correct.

¹ We do not need to give the tile set T as input because the tiles in α implicitly define a tile set,
and the presence of extra tile types in T that do not appear in α cannot affect its producibility.

Running time: Let n = |α|. The running time of Is-Producible-Assembly (Algorithm 1) is
polynomial in n, but the algorithm can be optimized to improve the running time to O(n log n) by
careful choice of data structures. Is-Producible-Assembly-Fast (Algorithm 2) shows pseudocode
for this optimized implementation, which we now describe. Instead of searching
over all pairs of assemblies, only search those pairs of assemblies that are adjacent. This number is
O(n) since a grid graph has degree at most 4 (hence O(n) edges) and the number of edges in the
full grid graph of α is an upper bound on the number of adjacent assemblies at any time. This can
be encoded in a dynamically changing graph G_C whose nodes are the current set of assemblies and
whose edges connect those assemblies that are adjacent.

Each edge of G_C stores the total glue strength between the assemblies. Whenever two assemblies
C_1 and C_2, with |C_1| ≥ |C_2| without loss of generality, are combined to form a new assembly, G_C is
updated by removing C_2, merging its edges with those of C_1, and for any edges they already share
(i.e., the neighbor on the other end of the edge is the same), summing the strengths on the edges.
Each update of an edge (adding it to C_1, or finding it in C_1 to update its strength) can be done in
O(log n) time using a tree set data structure to store neighbors for each assembly.

We claim that the total number of such updates of all edges is O(n log n) over all time, or
amortized O(log n) updates per iteration of the outer loop. To see why, observe that the number of
edges an assembly has is at most linear in its size, so the number of new edges that must be added
to C_1, or existing edges in C_1 whose strengths must be updated, is at most (within a constant) the
size of the smaller component C_2. The total number of edge updates is then, if T is the assembly
tree discovered by the algorithm, Σ_{nodes u ∈ T} min{|left(u)|, |right(u)|}, where |left(u)| and |right(u)|
respectively refer to the number of leaves of u's left and right subtrees. For a given number n of
leaves, this sum is maximized with a balanced tree, and in that case (summing over all levels of
the tree) is Σ_{i=0}^{log n} 2^i (n/2^i) = O(n log n). So the total time to update all edges is O(n log² n).
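
The counting step above can be written out as a short worked derivation. The block below is a sketch, not text from the paper: it counts the balanced case exactly, assuming n = 2^k leaves, and then gives the standard charging argument for an arbitrary assembly tree, which yields the same O(n log n) bound on the number of edge updates.

```latex
% Balanced tree on n = 2^k leaves: level i has 2^i internal nodes,
% and each child of such a node has n / 2^{i+1} leaves.
\[
  \sum_{i=0}^{k-1} 2^{i} \cdot \frac{n}{2^{i+1}}
  = \sum_{i=0}^{k-1} \frac{n}{2}
  = \frac{n}{2} \log_2 n
  = O(n \log n).
\]
% Arbitrary tree: a leaf counted in min{|left(u)|, |right(u)|} lies on the
% smaller side of u, so the subtree containing it at least doubles in size
% at u. This can happen at most \log_2 n times per leaf, hence
\[
  \sum_{u} \min\{|\mathrm{left}(u)|, |\mathrm{right}(u)|\} \le n \log_2 n .
\]
```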

As for actually finding C_1 and C_2, each iteration of the outer loop, we can just look for the pair
of adjacent assemblies with the largest connection strength. So store the edges in a heap and we
can simply grab the strongest edge off the top and update the heap by updating the keys containing
C_1 whose connection strength changed and removing those containing C_2 but not C_1. The edges
whose connection strength changed correspond to precisely those neighbors that C_1 and C_2 shared
before being merged. Therefore |C_2| is an upper bound (within a constant) on the number of edge
updates required. Thus the amortized number of heap updates is O(log n) per iteration of the outer
loop by the same argument as above. Thus it takes amortized time O(log n) per iteration if each heap
operation is O(log n). Therefore (this non-naive implementation of) the algorithm takes O(n log n) time.

The algorithm Is-Producible-Assembly-Fast (Algorithm 2) implements this optimized idea.

Algorithm 2 Is-Producible-Assembly-Fast(α, τ)
 1: input: assembly α and temperature τ
 2: V_C ← { {v} | v ∈ dom α }   // (positions defining) subassemblies of α; initially individual (positions of) tiles
 3: E_C ← { {{u}, {v}} | {u} ∈ V_C and {v} ∈ V_C and u and v are adjacent and interact }
 4: H ← empty heap
 5: for all {{u}, {v}} ∈ E_C do
 6:   w({u}, {v}) ← strength of glue binding α(u) and α(v)
 7:   insert({{u}, {v}}, H)
 8: end for
 9: while |V_C| > 1 do
10:   {C_1, C_2} ← remove-max(H)   // assume |C_1| ≥ |C_2| without loss of generality
11:   if w(C_1, C_2) < τ then
12:     print "α is not producible" and exit
13:   end if
14:   remove C_2 from V_C
15:   for all neighbors C ≠ C_1 of C_2 do
16:     remove {C_2, C} from E_C and H
17:     if {C_1, C} ∈ E_C then
18:       w(C_1, C) ← w(C_1, C) + w(C_2, C)
19:       increase-key({C_1, C}, H)   // update edge {C_1, C} with new weight
20:     else
21:       w(C_1, C) ← w(C_2, C)
22:       add {C_1, C} to E_C and H
23:     end if
24:   end for
25: end while
26: print "α is producible"



The terminology for heap operations is taken from [7]. Note that the way we remove C_1 and C_2
and add their union is to simply delete C_2 and then update C_1 to contain C_2's edges. The graph G_C
discussed above is G_C = (V_C, E_C) where V_C and E_C are variables in Is-Producible-Assembly-Fast.
The weight function w is used by the heap H to order its elements.

Summarizing the analysis, each data structure operation takes time O(log n) with appropriate
choice of a backing data structure. The two outer loops (lines 5 and 9) take O(n) iterations. The
inner loop (line 15) runs for amortized O(log n) iterations, and its body executes a constant number
of O(log n) time operations. Therefore the total running time is O(n log n). □
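
The optimized strategy can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: tiles are 4-tuples of (label, strength) glues ordered north, east, south, west, assemblies are dictionaries from positions to tiles, and the names bonds and is_producible_fast are hypothetical. The sketch keeps the smaller-into-larger merging of neighbor lists, but it swaps in lazy deletion (stale heap entries are skipped when popped) for the increase-key and remove operations, using Python's heapq with negated weights as a max-heap. Because of that substitution it makes no claim to match the running time stated in the theorem.

```python
import heapq


def bonds(alpha):
    """Yield (p, q, strength) for each interacting pair of adjacent tiles."""
    for (x, y), t in alpha.items():
        for q, i, j in (((x, y + 1), 0, 2), ((x + 1, y), 1, 3)):
            if q in alpha and t[i] == alpha[q][j] and t[i][1] > 0:
                yield (x, y), q, t[i][1]


def is_producible_fast(alpha, tau):
    ident = {p: i for i, p in enumerate(alpha)}   # one component per tile initially
    nbrs = {i: {} for i in ident.values()}        # component -> {neighbor: total strength}
    size = {i: 1 for i in ident.values()}
    heap = []                                     # entries (-strength, a, b)
    for p, q, s in bonds(alpha):
        a, b = ident[p], ident[q]
        nbrs[a][b] = nbrs[b][a] = s
        heapq.heappush(heap, (-s, a, b))
    count = len(alpha)
    while count > 1:
        while heap:
            neg_w, a, b = heapq.heappop(heap)
            # Skip stale entries: a component was merged away or the weight changed.
            if a in nbrs and nbrs[a].get(b) == -neg_w:
                break
        else:
            return False                          # no adjacent components remain
        if -neg_w < tau:
            return False                          # even the strongest bond is too weak
        if size[a] < size[b]:
            a, b = b, a                           # always merge the smaller into the larger
        for c, w in nbrs.pop(b).items():
            del nbrs[c][b]
            if c != a:
                nbrs[a][c] = nbrs[c][a] = nbrs[a].get(c, 0) + w
                heapq.heappush(heap, (-nbrs[a][c], a, c))
        size[a] += size.pop(b)
        count -= 1
    return True
```

Edge weights between two live components only increase, so an entry whose stored weight equals the current weight is always the most recent one for that pair.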



4 Linear-time verification of temperature 1 unique production 

This section shows that there is an algorithm, faster than the previously known algorithm [5], that
solves the temperature 1 unique production verification (UPV) problem: given an assembly α and
a temperature-1 hierarchical tile system 𝒯, decide if α is the unique producible, terminal assembly
of 𝒯. This is done by showing an algorithm for the temperature 1 UPV problem in the seeded model
(which is faster than the general-temperature algorithm of [2]), and then applying the technique
of [5] relating producibility and terminality in the temperature 1 seeded and hierarchical models.

Define the decision problems sUPV_1 and hUPV_1 by the language { (𝒯, α) | A_□[𝒯] = {α} },
where 𝒯 is a temperature 1 seeded TAS in the former case and a temperature 1 hierarchical TAS
in the latter case. To simplify the time analysis we assume |𝒯| = O(|α|).

The following is the only result in this paper on the seeded aTAM. 

Theorem 4.1. There is an algorithm that solves the sUPV_1 problem in time O(|α| log |T|).

Proof. Let 𝒯 = (T, s, 1) and α be an instance of the sUPV_1 problem. We first check that every
tile in α appears in T, which can be done in time O(|α| log |T|) by storing elements of T in a data
structure supporting O(log n) time access. In the seeded aTAM at temperature 1, α is producible if
and only if it contains the seed s and its binding graph is connected, which can be checked in time
O(|α|). We must also verify that α is terminal, which is true if and only if all glues on unbound
sides are null, checkable in time O(|α|).
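
The three checks in this paragraph can be sketched in Python as follows. The representation is an assumption made for illustration: tiles are 4-tuples of (label, strength) glues ordered north, east, south, west, the assembly is a dictionary from positions to tiles, the tile set is a Python set, and the seed is taken to sit at the origin. The names bound and producible_and_terminal_temp1 are hypothetical. A hash set replaces the O(log |T|) lookup structure mentioned in the text.

```python
from collections import deque

# (dx, dy, side index on this tile, side index on the neighbor); sides are N, E, S, W.
SIDES = ((0, 1, 0, 2), (1, 0, 1, 3), (0, -1, 2, 0), (-1, 0, 3, 1))


def bound(alpha, p, side):
    """True if the given side of the tile at p is bound to the adjacent tile."""
    dx, dy, i, j = side
    q = (p[0] + dx, p[1] + dy)
    return q in alpha and alpha[p][i] == alpha[q][j] and alpha[p][i][1] > 0


def producible_and_terminal_temp1(alpha, tiles, seed):
    # 1. Every tile of alpha must be a tile type of the system.
    if not all(t in tiles for t in alpha.values()):
        return False
    # 2. Producible at temperature 1: the seed is present and the binding graph is connected.
    if alpha.get((0, 0)) != seed:
        return False
    seen, queue = {(0, 0)}, deque([(0, 0)])
    while queue:
        p = queue.popleft()
        for side in SIDES:
            q = (p[0] + side[0], p[1] + side[1])
            if q not in seen and bound(alpha, p, side):
                seen.add(q)
                queue.append(q)
    if len(seen) != len(alpha):
        return False
    # 3. Terminal: every glue on an unbound side is null.
    return all(alpha[p][s[2]][1] == 0 or bound(alpha, p, s)
               for p in alpha for s in SIDES)
```

Passing these checks does not yet establish unique production, which is the subject of the remainder of the proof.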

Once we have verified that α is producible and terminal, it remains to verify that 𝒯 uniquely
produces α. Adleman, Cheng, Goel, Huang, Kempe, Moisset de Espanes, and Rothemund [2]
showed that this is true (at any temperature) if and only if, for every position p ∈ dom α, if α_p ⊑ α
is the maximal producible subassembly of α such that p ∉ dom α_p, then α(p) is the only tile type
attachable to α_p at position p. They solve the problem by producing each such α_p and checking
whether there is more than one tile type attachable to α_p at p. We use a similar approach, but we
avoid producing each α_p by exploiting special properties of temperature 1 producibility.

Given p, q ∈ dom α such that p ≠ q, write p ≺ q if, for every producible assembly β, q ∈
dom β ⟹ p ∈ dom β, i.e., the tile at position p must be present before the tile at position q can
be attached. We must check each p ∈ dom α and each position q ∈ dom α adjacent to p such that
p ⊀ q to see whether a tile type t ≠ α(p) shares a positive-strength glue with α(q) in direction q − p
(i.e., whether, if α(p) were not present, t could attach at p instead). If we know which positions
q adjacent to p satisfy p ⊀ q, this check can be done in time O(log |T|) with appropriate choice of
data structure, implying total time O(|α| log |T|) over all positions p ∈ dom α. It remains to show
how to determine which adjacent positions p, q ∈ dom α satisfy p ≺ q.

Recall that a cut vertex of a connected graph is a vertex whose removal disconnects the graph,
and that a subgraph is biconnected if the removal of any single vertex from the subgraph leaves it
connected. Every graph can be decomposed into a tree of biconnected components, with cut vertices
connecting different biconnected components (and belonging to all biconnected components that
they connect). If p is not a cut vertex of the binding graph of α, then dom α_p is simply dom α \ {p}
because, for all q ∈ dom α \ {p}, p ⊀ q. If p is a cut vertex, then p ≺ q if and only if removing p from
the binding graph of α places q and the seed position in two different connected components. This
is because the connected component containing the seed after removing p corresponds precisely to
α_p.

Run the linear time Hopcroft-Tarjan algorithm [18] for decomposing the binding graph of α
into a tree of its biconnected components, which also identifies which vertices in the graph are
cut vertices and which biconnected components they connect. Recall that the Hopcroft-Tarjan
algorithm is an augmented depth-first search. Root the tree with s's biconnected component (i.e.,
start the depth-first search there), so that each component has a parent component and child
components. In particular, each cut vertex p has a "parent" biconnected component and k ≥ 1
"child" biconnected components. Removing such a cut vertex p will separate the graph into k + 1
connected components: the nodes in the k subtrees and the remaining nodes connected to the
parent biconnected component of p. Thus p ≺ q if and only if p is a cut vertex and q is contained
in the subtree rooted at p.

This check can be done for all positions p and their ≤ 4 adjacent positions q in linear time by
"weaving" the checks into the Hopcroft-Tarjan algorithm. As the depth-first search executes, each
vertex p is marked as either unvisited, visiting (meaning the search is currently in a subtree rooted
at p), or visited (meaning the search has visited and exited the subtree rooted at p). If p is marked
as visited or unvisited at the time q is processed, then q is not in the subtree under p. If p is marked
as visiting when q is processed, then q is in p's subtree.

At the time q is visited, it may not yet be known whether p is a cut vertex. To account for
this, simply run the Hopcroft-Tarjan algorithm twice, doing the checks just described on the second
execution, using the cut vertex information obtained on the first execution. □
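
To make the relation p ≺ q concrete, the following Python sketch computes it directly from the characterization above: remove p from the binding graph and see which positions can no longer be reached from the seed. This is a deliberately naive quadratic-time illustration, not the linear-time Hopcroft-Tarjan approach used in the proof. It assumes α is producible at temperature 1 with the seed at the origin, uses the same hypothetical tile representation as before (4-tuples of (label, strength) glues ordered north, east, south, west), and the names neighbors and precedes are hypothetical.

```python
from collections import deque

# (dx, dy, side index on this tile, side index on the neighbor); sides are N, E, S, W.
SIDES = ((0, 1, 0, 2), (1, 0, 1, 3), (0, -1, 2, 0), (-1, 0, 3, 1))


def neighbors(alpha, p):
    """Positions adjacent to p whose tiles interact with the tile at p."""
    for dx, dy, i, j in SIDES:
        q = (p[0] + dx, p[1] + dy)
        if q in alpha and alpha[p][i] == alpha[q][j] and alpha[p][i][1] > 0:
            yield q


def precedes(alpha, seed_pos=(0, 0)):
    """Return the set of pairs (p, q), p not the seed position, with p required before q."""
    pairs = set()
    for p in alpha:
        if p == seed_pos:
            continue
        # Breadth-first search from the seed in the binding graph with p removed.
        seen, queue = {seed_pos}, deque([seed_pos])
        while queue:
            u = queue.popleft()
            for v in neighbors(alpha, u):
                if v != p and v not in seen:
                    seen.add(v)
                    queue.append(v)
        # Anything cut off from the seed cannot be placed before p.
        pairs.update((p, q) for q in alpha if q != p and q not in seen)
    return pairs
```

For a three-tile horizontal path with the seed at one end, the only pair returned is the middle position preceding the far end; for a fully bonded 2 × 2 square there are no cut vertices and the result is empty.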

Theorem 4.2. There is an algorithm that solves the hUPV_1 problem in time O(|α||T| log |T|).

Proof. Cannon, Demaine, Demaine, Eisenstat, Patitz, Schweller, Summers, and Winslow [5] showed
that a temperature 1 hierarchical TAS 𝒯 = (T, 1) uniquely produces α if and only if, for each s ∈ T,
the seeded TAS 𝒯_s = (T, s, 1) uniquely produces α. Therefore, the hUPV_1 problem can be solved by
calling the algorithm of Theorem 4.1 |T| times, resulting in a running time of O(|α||𝒯| log |𝒯|). □



5 Unions of producible assemblies are producible 

Throughout this section, fix a hierarchical TAS 𝒯 = (T, τ). Let α, β be assemblies. We say α and
β are consistent if α(p) = β(p) for all points p ∈ dom α ∩ dom β. If α and β are consistent, let
α ∪ β be defined as the assembly (α ∪ β)(p) = α(p) if α(p) is defined, and (α ∪ β)(p) = β(p) if α(p) is
undefined. If α and β are not consistent, let α ∪ β be undefined.
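
These definitions translate directly into code. The sketch below is an illustration only, assuming assemblies are Python dictionaries from positions to tile types; the names consistent, union, and translate are hypothetical, and an undefined union is represented by None.

```python
def consistent(alpha, beta):
    """True if alpha and beta agree on every position they share."""
    return all(beta[p] == t for p, t in alpha.items() if p in beta)


def union(alpha, beta):
    """The assembly alpha ∪ beta, or None when alpha and beta are inconsistent."""
    if not consistent(alpha, beta):
        return None
    # alpha's value wins on shared positions, which is harmless since they agree there.
    return {**beta, **alpha}


def translate(alpha, v):
    """The assembly alpha shifted by the vector v = (dx, dy)."""
    return {(x + v[0], y + v[1]): t for (x, y), t in alpha.items()}
```

For instance, union({(0, 0): "A", (1, 0): "B"}, {(1, 0): "B", (2, 0): "C"}) covers three positions, while replacing "B" in the second assembly with a different tile type makes the union undefined.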

Theorem 5.1. If α, β are producible assemblies that are consistent and dom α ∩ dom β ≠ ∅, then
α ∪ β is producible. Furthermore, α → α ∪ β, i.e., it is possible to assemble exactly α, then to
assemble the missing portions of β.

Proof. If α and β are consistent and have non-empty overlap, then α ∪ β is necessarily stable, since
every cut of α ∪ β is a superset of some cut of either α or β, which are themselves stable.

Figure 1: Procedure to construct an assembly tree for α ∪ β from separate assembly trees for α and β.
(a) First operation to combine the assembly trees for α and β. l_1 and l_2 are two leaves representing
the same position in dom α ∩ dom β. (b) Operation to eliminate one of two leaves l_1 and l_2
representing the same tile in the tree while preserving that all attachments are stable.



Let T_α and T_β be assembly trees for α and β, respectively. Define an assembly tree T for α ∪ β
by the following construction. Let l_1 be a leaf in T_α and let l_2 be a leaf in T_β representing the
same position x ∈ dom α ∩ dom β, as shown in Figure 1(a). Remove l_2 and replace it with the
entire tree T_α. Call the resulting tree T'. At this point, T' is not an assembly tree if α and β
overlapped on more than one point, because every position in dom α ∩ dom β \ {x} has duplicated
leaves. Therefore the tree T' is not a hierarchical division of the set dom α ∪ dom β, since not
all unions represented by each internal node are disjoint unions. However, each such union does
represent a stable assembly. We will show how to modify T' to eliminate each of these duplicates,
while maintaining the invariant that each internal node represents a stable attachment, and we
call the resulting assembly tree T. Furthermore, the subtree T_α that was placed under p_2 will not
change as a result of these modifications, which implies α → α ∪ β.

The process to eliminate one pair of duplicate leaves is shown in Figure 1(b). Let l_1 and l_2 be
two leaves representing the same point in dom α ∩ dom β, and let a be their least common ancestor
in T, noting that a is not contained in T_α since l_2 is not contained in T_α. Let p_a be the parent of
a. Let r_1 be the root of the subtree under a containing l_1. Let r_2 be the root of the subtree under
a containing l_2. Let p_2 be the parent of l_2. Remove the leaf l_2 and the node a. Set the parent of
r_1 to be p_2. Set the parent of r_2 to be p_a.

Since we have replaced the leaf l_2 with a subtree containing the leaf l_1, the subtree rooted at
r_1 is an assembly containing the tile represented by l_2, in the same position. Since the original
attachment of l_2 to its sibling was stable, by monotonicity, the attachment represented by p_2 is still
legal. The removal of a is simply to maintain that T is a full binary tree; leaving it would mean
that it represents a superfluous "attachment" of the assembly r_2 to ∅. However, it is now legal
for r_2 to be a direct child of p_a, since r_2 (due to the insertion of the entire r_1 subtree beneath a
descendant of r_2, again by monotonicity) now has all the tiles necessary for its attachment to the
old sibling of a to be stable. Since a was not contained in T_α, the subtree T_α has not been altered.

This process is iterated for all duplicate leaves. When all duplicates have been removed, T is a
valid assembly tree with root α ∪ β. Since T contains T_α as a subtree, α → α ∪ β. □

6 Conclusion 

Theorem 5.1 shows that if producible assemblies $\alpha$ and $\beta$ overlap consistently, then $\alpha \cup \beta$ is producible. What
if $\alpha = \beta$? Suppose we have three copies of $\alpha$, and label them each uniquely as $\alpha_1, \alpha_2, \alpha_3$. Suppose
further that $\alpha_2$ overlaps consistently with $\alpha_1$ when translated by some non-zero vector $\vec{v}$. Then
we know that $\alpha_1 \cup \alpha_2$ is producible. Suppose that $\alpha_3$ is $\alpha_2$ translated by $\vec{v}$, or equivalently it is
$\alpha_1$ translated by $2\vec{v}$. Then $\alpha_2 \cup \alpha_3$ is producible, since this is merely a translated copy of $\alpha_1 \cup \alpha_2$.
It seems intuitively that $\alpha_1 \cup \alpha_2 \cup \alpha_3$ should be producible as well. However, while $\alpha_1$ overlaps
consistently with $\alpha_2$, and $\alpha_2$ overlaps consistently with $\alpha_3$, it could be the case that $\alpha_3$ intersects
$\alpha_1$ inconsistently, i.e., they share a position but put a different tile type at that position. In this
case $\alpha_1 \cup \alpha_2 \cup \alpha_3$ is undefined. See Figure 2 for an example.
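
The situation can be reproduced on a small toy example. The sketch below is an assumption-laden illustration: assemblies are modeled as plain dictionaries from positions to tile-type names, glues, stability and producibility are not modeled at all, and the nine-tile assembly `alpha` is a hypothetical one chosen for this example (it is not the assembly of Figure 2, although it uses the same translation vector $(2, -2)$).

```python
def translate(alpha, v):
    """Return alpha shifted by the vector v."""
    return {(x + v[0], y + v[1]): t for (x, y), t in alpha.items()}


def overlaps_consistently(alpha, beta):
    """True if the assemblies share a position and agree on every shared position."""
    shared = alpha.keys() & beta.keys()
    return bool(shared) and all(alpha[p] == beta[p] for p in shared)


def union(alpha, beta):
    """Union of two assemblies, or None if they disagree on a shared position."""
    if any(alpha[p] != beta[p] for p in alpha.keys() & beta.keys()):
        return None
    return {**alpha, **beta}


# Hypothetical toy assembly: gray tiles g1..g5 are pairwise distinct types.
alpha = {
    (-1, 0): "green", (0, 0): "blue", (1, 0): "g1", (2, 0): "g2", (2, -1): "g3",
    (2, -2): "blue", (2, -3): "g4", (2, -4): "g5", (3, -4): "red",
}
v = (2, -2)
a1 = alpha
a2 = translate(a1, v)
a3 = translate(a2, v)

print(overlaps_consistently(a1, a2))  # the copies share only the blue tile at (2, -2)
print(overlaps_consistently(a2, a3))
print(union(a1, a3))                  # red versus green at (3, -4), so undefined
```

Here `a1` and `a2` agree on their single shared position, as do `a2` and `a3`, yet `a1` and `a3` disagree at position $(3, -4)$, so the three-way union is undefined.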

Figure 2: (a) A producible assembly $\alpha$. Gray tiles are all distinct types from each other, but red, green,
and blue each represent one of three different tile types, so the two blue tiles are the same type. (b) By
Theorem 5.1, $\alpha_1 \cup \alpha_2$ is producible, where $\alpha_1 = \alpha$ and $\alpha_2 = \alpha_1 + (2, -2)$, because they overlap in only one
position, and they both have the blue tile type there. (c) $\alpha_1$ and $\alpha_3$ both have a tile at the same position,
but the types are different (red in the case of $\alpha_1$ and green in the case of $\alpha_3$). (d) However, a subassembly
$\alpha'_i$ of each new $\alpha_i$ can grow, enough to allow the translated equivalent subassembly $\alpha'_{i+1}$ of $\alpha_{i+1}$ to grow
from $\alpha'_i$, so an infinite structure is producible.

However, in this case, although $\alpha_1 \cup \alpha_2 \cup \alpha_3$ is not producible (in fact, not even defined),
"enough" of $\alpha_3$ (say, $\alpha'_3 \sqsubseteq \alpha_3$) can grow off of $\alpha_1 \cup \alpha_2$ to allow a fourth copy $\alpha_4$ to begin to
grow to an assembly to which a fifth copy $\alpha_5$ can attach, etc., so that an infinite assembly can
grow by "pumping" additional copies of $\alpha'_i$. Is this always possible? In other words, is it the
case that if $\alpha$ is a producible assembly of a hierarchical TAS $\mathcal{T}$, and $\alpha$ overlaps consistently with
some non-zero translation of itself, then $\mathcal{T}$ necessarily produces an infinite assembly? If true, this
would imply that no hierarchical TAS producing such an assembly could uniquely produce a
finite shape. This would settle an open question posed by Chen and Doty [6], who showed that as
long as a hierarchical TAS does not produce assemblies that consistently overlap any translation of
themselves, then the TAS cannot uniquely produce any shape in time sublinear in its diameter.
A Formal definition of the abstract tile assembly model

This section gives a terse definition of the abstract Tile Assembly Model (aTAM, [29]). This is not 
a tutorial; for readers unfamiliar with the aTAM, [24] gives an excellent introduction to the model. 

Fix an alphabet $\Sigma$. $\Sigma^*$ is the set of finite strings over $\Sigma$. Given a discrete object $O$, $\langle O \rangle$ denotes
a standard encoding of $O$ as an element of $\Sigma^*$. $\mathbb{Z}$, $\mathbb{Z}^+$, and $\mathbb{N}$ denote the set of integers, positive
integers, and nonnegative integers, respectively. For a set $A$, $\mathcal{P}(A)$ denotes the power set of $A$.
Given $A \subseteq \mathbb{Z}^2$, the full grid graph of $A$ is the undirected graph $G^{\mathrm{f}}_A = (V, E)$, where $V = A$, and for
all $u, v \in V$, $\{u, v\} \in E \iff \|u - v\|_2 = 1$; i.e., if and only if $u$ and $v$ are adjacent on the integer
Cartesian plane. A shape is a set $S \subseteq \mathbb{Z}^2$ such that $G^{\mathrm{f}}_S$ is connected.
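
As a concrete reading of this definition, connectivity of the full grid graph can be tested with a breadth-first search. The function name `is_shape` is hypothetical, and treating the empty set as not being a shape is an assumption of this sketch (the definition above does not address it).

```python
from collections import deque


def is_shape(points):
    """True if the full grid graph of `points` is connected.

    Two points are adjacent when their Euclidean distance is exactly 1.
    Assumption: the empty set is not considered a shape.
    """
    points = set(points)
    if not points:
        return False
    start = next(iter(points))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for q in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if q in points and q not in seen:
                seen.add(q)
                queue.append(q)
    return seen == points


l_shape = {(0, 0), (1, 0), (1, 1)}
diagonal = {(0, 0), (1, 1)}   # diagonal neighbors are not adjacent
```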

A tile type is a tuple $t \in (\Sigma^* \times \mathbb{N})^4$; i.e., a unit square with four sides listed in some standardized
order, each side having a glue label (a.k.a. glue) $\ell \in \Sigma^*$ and a nonnegative integer strength, denoted
$\mathrm{str}(\ell)$. For a set of tile types $T$, let $\Lambda(T) \subset \Sigma^*$ denote the set of all glue labels of tile types in
$T$. If a glue has strength 0, we say it is null, and if a positive-strength glue facing some direction
does not appear on some tile type in the opposite direction, we say it is functionally null. We
assume that all tile sets in this paper contain no functionally null glues.^ Let $\{N, S, E, W\}$ denote
the directions consisting of unit vectors $\{(0, 1), (0, -1), (1, 0), (-1, 0)\}$. Given a tile type $t$ and a
direction $d \in \{N, S, E, W\}$, $t(d) \in \Lambda(T)$ denotes the glue label on $t$ in direction $d$. We assume a finite
set $T$ of tile types, but an infinite number of copies of each tile type, each copy referred to as a tile.
An assembly is a nonempty connected arrangement of tiles on the integer lattice $\mathbb{Z}^2$, i.e., a partial
function $\alpha : \mathbb{Z}^2 \dashrightarrow T$ such that $G^{\mathrm{f}}_{\mathrm{dom}\,\alpha}$ is connected and $\mathrm{dom}\,\alpha \neq \emptyset$. The shape of $\alpha$ is $\mathrm{dom}\,\alpha$.
Write $|\alpha|$ to denote $|\mathrm{dom}\,\alpha|$. Given two assemblies $\alpha, \beta : \mathbb{Z}^2 \dashrightarrow T$, we say $\alpha$ is a subassembly of $\beta$,
and we write $\alpha \sqsubseteq \beta$, if $\mathrm{dom}\,\alpha \subseteq \mathrm{dom}\,\beta$ and, for all points $p \in \mathrm{dom}\,\alpha$, $\alpha(p) = \beta(p)$.

Given two assemblies $\alpha$ and $\beta$, we say $\alpha$ and $\beta$ are equivalent up to translation, written $\alpha \simeq \beta$,
if there is a vector $\vec{x} \in \mathbb{Z}^2$ such that $\mathrm{dom}\,\alpha = \mathrm{dom}\,\beta + \vec{x}$ (where for $A \subseteq \mathbb{Z}^2$, $A + \vec{x}$ is defined to be
$\{\, p + \vec{x} \mid p \in A \,\}$) and for all $p \in \mathrm{dom}\,\beta$, $\alpha(p + \vec{x}) = \beta(p)$. In this case we say that $\beta$ is a translation
of $\alpha$. We have fixed assemblies at certain positions on $\mathbb{Z}^2$ only for mathematical convenience in
some contexts, but of course real assemblies float freely in solution and do not have a fixed position.
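
One simple way to test equivalence up to translation, sketched here under the assumption that an assembly is stored as a nonempty dictionary from positions to tile types, is to shift each assembly so that its lexicographically smallest position lands on the origin and then compare. The helper names are hypothetical.

```python
def normalize(alpha):
    """Translate alpha so that its lexicographically smallest position is (0, 0)."""
    x0, y0 = min(alpha)   # lexicographic minimum over positions
    return {(x - x0, y - y0): t for (x, y), t in alpha.items()}


def equivalent_up_to_translation(alpha, beta):
    """True if some translation of beta equals alpha."""
    return normalize(alpha) == normalize(beta)


a = {(0, 0): "x", (1, 0): "y"}
b = {(5, -3): "x", (6, -3): "y"}   # a translated by (5, -3)
c = {(5, -3): "y", (6, -3): "x"}   # same shape, tile types swapped
```

This works because the lexicographic order on positions is preserved by translation, so the minimum position of one assembly is mapped to the minimum position of the other.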

Let $\alpha$ be an assembly and let $p \in \mathrm{dom}\,\alpha$ and $d \in \{N, S, E, W\}$ such that $p + d \in \mathrm{dom}\,\alpha$. Let
$t = \alpha(p)$ and $t' = \alpha(p + d)$. We say that the tiles $t$ and $t'$ at positions $p$ and $p + d$ interact if
$t(d) = t'(-d)$ and $\mathrm{str}(t(d)) > 0$, i.e., if the glue labels on their abutting sides are equal and have
positive strength. Each assembly $\alpha$ induces a binding graph $G^{\mathrm{b}}_\alpha$, a grid graph $G^{\mathrm{b}}_\alpha = (V_\alpha, E_\alpha)$, where
$V_\alpha = \mathrm{dom}\,\alpha$, and $\{p_1, p_2\} \in E_\alpha \iff \alpha(p_1)$ interacts with $\alpha(p_2)$.^ Given $\tau \in \mathbb{Z}^+$, $\alpha$ is $\tau$-stable if
every cut of $G^{\mathrm{b}}_\alpha$ has weight at least $\tau$, where the weight of an edge is the strength of the glue it
represents. That is, $\alpha$ is $\tau$-stable if at least energy $\tau$ is required to separate $\alpha$ into two parts. When
$\tau$ is clear from context, we say $\alpha$ is stable.
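
To make the definition concrete, the following sketch builds the binding graph and checks $\tau$-stability by brute force over all bipartitions of the positions. This is exponential in $|\alpha|$ and is meant only for tiny examples; it is not an efficient minimum-cut algorithm. The data representation (a tile is a dictionary from direction to a `(label, strength)` pair, with the empty label as the null glue) and the helper `tile` are assumptions of this example.

```python
from itertools import combinations

STEP = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}


def tile(n="", s="", e="", w="", strength=1):
    """Hypothetical helper: every non-empty glue label gets the same strength."""
    return {d: (g, strength if g else 0) for d, g in zip("NSEW", (n, s, e, w))}


def binding_edges(alpha):
    """Return {(p, q): strength} for every pair of interacting adjacent tiles."""
    edges = {}
    for p, t in alpha.items():
        for d in ("N", "E"):                     # visit each unordered pair once
            q = (p[0] + STEP[d][0], p[1] + STEP[d][1])
            if q not in alpha:
                continue
            label, strength = t[d]
            other_label, _ = alpha[q][OPPOSITE[d]]
            if label == other_label and strength > 0:
                edges[(p, q)] = strength
    return edges


def is_stable(alpha, tau):
    """Brute force: every cut of the binding graph must have weight >= tau."""
    points = sorted(alpha)
    edges = binding_edges(alpha)
    anchor, rest = points[0], points[1:]
    for k in range(len(rest)):                   # the other side stays nonempty
        for extra in combinations(rest, k):
            side = {anchor, *extra}
            weight = sum(s for (p, q), s in edges.items()
                         if (p in side) != (q in side))
            if weight < tau:
                return False
    return True


t = tile("g", "g", "g", "g")                     # strength-1 glue on all sides
square = {(0, 0): t, (1, 0): t, (0, 1): t, (1, 1): t}
domino = {(0, 0): t, (1, 0): t}
```

In this example the $2 \times 2$ square is 2-stable, because every cut of its four-cycle binding graph crosses at least two strength-1 edges, while the domino is only 1-stable.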



^This assumption does not affect the results of this paper. It is irrelevant for Theorem 5.1 or the correctness of
the algorithms in the other theorems. It also does not affect the running time results for algorithms taking a TAS
as input, because we can preprocess $T$ in linear time to find and set to null any functionally null glues. The number
of glues is $O(|T|)$, and we assume that each glue from glue set $G$ is an integer in the set $\{0, \ldots, |G| - 1\}$. We can
use a Boolean array of size $|G|$ to determine in time $O(|T|)$ which glues appear on the north that do not appear on
the south of some tile type. Repeat this for each of the remaining three directions. Then replace all functionally null
glues in $T$ with null glues, which takes time $O(|T|)$. To do this replacement in an assembly $\alpha$ takes time $O(|\alpha|)$.
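
The preprocessing described in this footnote can be sketched as follows. The representation is an assumption of the example: a tile type is a dictionary from direction to an integer glue, `strength[g]` gives the strength of glue `g`, and glue `0` is taken to be a null glue of strength 0. The function names are hypothetical.

```python
PAIRS = (("N", "S"), ("S", "N"), ("E", "W"), ("W", "E"))


def functionally_null_sides(tile_types, strength):
    """Return the set of (tile index, direction) sides holding a functionally null glue."""
    found = set()
    for d, opposite in PAIRS:
        appears_opposite = [False] * len(strength)   # Boolean array of size |G|
        for t in tile_types:
            appears_opposite[t[opposite]] = True
        for i, t in enumerate(tile_types):
            g = t[d]
            if strength[g] > 0 and not appears_opposite[g]:
                found.add((i, d))
    return found


def remove_functionally_null(tile_types, strength, null_glue=0):
    """Return a copy of the tile set with functionally null glues set to null."""
    cleaned = [dict(t) for t in tile_types]
    for i, d in functionally_null_sides(tile_types, strength):
        cleaned[i][d] = null_glue
    return cleaned


strength = [0, 1, 1]                     # glue 0 is null; glues 1 and 2 have strength 1
tiles = [
    {"N": 1, "S": 0, "E": 0, "W": 0},    # glue 1 on the north ...
    {"N": 2, "S": 1, "E": 0, "W": 0},    # ... matches glue 1 on the south; glue 2 has no match
]
cleaned = remove_functionally_null(tiles, strength)
```

Each of the four direction passes scans the tile set a constant number of times, which matches the linear-time bound stated in the footnote.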

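^For $G^{\mathrm{f}}_{\mathrm{dom}\,\alpha} = (V_{\mathrm{dom}\,\alpha}, E_{\mathrm{dom}\,\alpha})$ and $G^{\mathrm{b}}_\alpha = (V_\alpha, E_\alpha)$, $G^{\mathrm{b}}_\alpha$ is a spanning subgraph of $G^{\mathrm{f}}_{\mathrm{dom}\,\alpha}$: $V_\alpha = V_{\mathrm{dom}\,\alpha}$ and $E_\alpha \subseteq E_{\mathrm{dom}\,\alpha}$.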
A.1 Seeded aTAM

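A seeded tile assembly system (seeded TAS) is a triple $\mathcal{T} = (T, \sigma, \tau)$, where $T$ is a finite set of tile
types, $\sigma : \mathbb{Z}^2 \dashrightarrow T$ is the finite, $\tau$-stable seed assembly, and $\tau \in \mathbb{Z}^+$ is the temperature. Let $|\mathcal{T}|$
denote $|T|$. If $\mathcal{T}$ has a single seed tile $s \in T$ (i.e., $\sigma(0, 0) = s$ for some $s \in T$ and is undefined
elsewhere), then we write $\mathcal{T} = (T, s, \tau)$. Given two $\tau$-stable assemblies $\alpha, \beta : \mathbb{Z}^2 \dashrightarrow T$, we write
$\alpha \to_1^{\mathcal{T}} \beta$ if $\alpha \sqsubseteq \beta$ and $|\mathrm{dom}\,\beta \setminus \mathrm{dom}\,\alpha| = 1$. In this case we say $\alpha$ $\mathcal{T}$-produces $\beta$ in one step.^ If
$\alpha \to_1^{\mathcal{T}} \beta$, $\mathrm{dom}\,\beta \setminus \mathrm{dom}\,\alpha = \{p\}$, and $t = \beta(p)$, we write $\beta = \alpha + (p \mapsto t)$.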

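A sequence of $k \in \mathbb{Z}^+$ assemblies $\vec{\alpha} = (\alpha_0, \alpha_1, \ldots, \alpha_{k-1})$ is a $\mathcal{T}$-assembly sequence if, for all
$1 \le i < k$, $\alpha_{i-1} \to_1^{\mathcal{T}} \alpha_i$. We write $\alpha \to^{\mathcal{T}} \beta$, and we say $\alpha$ $\mathcal{T}$-produces $\beta$ (in 0 or more steps) if there
is a $\mathcal{T}$-assembly sequence $\vec{\alpha} = (\alpha, \alpha_1, \alpha_2, \ldots, \alpha_{k-1} = \beta)$ of length $k = |\mathrm{dom}\,\beta \setminus \mathrm{dom}\,\alpha| + 1$. We say
$\alpha$ is $\mathcal{T}$-producible if $\sigma \to^{\mathcal{T}} \alpha$, and we write $\mathcal{A}[\mathcal{T}]$ to denote the set of $\mathcal{T}$-producible assemblies. The
relation $\to^{\mathcal{T}}$ is a partial order on $\mathcal{A}[\mathcal{T}]$ [19,23].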

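An assembly $\alpha$ is $\mathcal{T}$-terminal if $\alpha$ is $\tau$-stable and $\partial^{\mathcal{T}} \alpha = \emptyset$ (i.e., no tile of $T$ can be $\tau$-stably
attached to $\alpha$). We write $\mathcal{A}_\Box[\mathcal{T}] \subseteq \mathcal{A}[\mathcal{T}]$ to denote the set of $\mathcal{T}$-producible, $\mathcal{T}$-terminal assemblies.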
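
The one-step relation and terminality can be illustrated with a small sketch. It relies on the observation that a single tile can be added to a $\tau$-stable assembly exactly when it binds with total strength at least $\tau$. Everything about the representation is an assumption of this example: tile types are named dictionaries from direction to a `(label, strength)` pair, an assembly maps positions to tile names, and `frontier` returns (position, tile name) pairs rather than positions alone.

```python
STEP = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}
NULL = ("", 0)                                   # null glue: empty label, strength 0


def make_tile(**sides):
    """Hypothetical helper: unspecified sides carry the null glue."""
    return {d: sides.get(d, NULL) for d in "NSEW"}


def attachment_strength(alpha, tiles, p, name):
    """Total strength with which tile `name` placed at empty position p binds to alpha."""
    total = 0
    for d, (dx, dy) in STEP.items():
        q = (p[0] + dx, p[1] + dy)
        if q in alpha:
            label, strength = tiles[name][d]
            other_label, _ = tiles[alpha[q]][OPPOSITE[d]]
            if label == other_label and strength > 0:
                total += strength
    return total


def frontier(alpha, tiles, tau):
    """All (position, tile name) pairs that can attach to alpha at temperature tau."""
    empty = {(x + dx, y + dy) for (x, y) in alpha for dx, dy in STEP.values()} - set(alpha)
    return {(p, name) for p in empty for name in tiles
            if attachment_strength(alpha, tiles, p, name) >= tau}


def is_terminal(alpha, tiles, tau):
    return not frontier(alpha, tiles, tau)


tiles = {
    "s": make_tile(E=("a", 2)),                  # seed tile with a strength-2 east glue
    "x": make_tile(W=("a", 2)),                  # binds to the east of the seed
}
seed = {(0, 0): "s"}
grown = {**seed, (1, 0): "x"}                    # seed + ((1, 0) -> x)
```

With temperature 2, the only possible attachment to the seed is `x` at position $(1, 0)$, after which the assembly is terminal.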

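A seeded TAS $\mathcal{T}$ is directed (a.k.a., deterministic, confluent) if the poset $(\mathcal{A}[\mathcal{T}], \to^{\mathcal{T}})$ is directed;
i.e., if for each $\alpha, \beta \in \mathcal{A}[\mathcal{T}]$, there exists $\gamma \in \mathcal{A}[\mathcal{T}]$ such that $\alpha \to^{\mathcal{T}} \gamma$ and $\beta \to^{\mathcal{T}} \gamma$.^ We say that $\mathcal{T}$
uniquely produces $\alpha$ if $\mathcal{A}_\Box[\mathcal{T}] = \{\alpha\}$.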

A.2 Hierarchical aTAM

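A hierarchical tile assembly system (hierarchical TAS) is a pair $\mathcal{T} = (T, \tau)$, where $T$ is a finite set
of tile types, and $\tau \in \mathbb{Z}^+$ is the temperature. Let $\alpha, \beta : \mathbb{Z}^2 \dashrightarrow T$ be two assemblies. Say that $\alpha$
and $\beta$ are nonoverlapping if $\mathrm{dom}\,\alpha \cap \mathrm{dom}\,\beta = \emptyset$. If $\alpha$ and $\beta$ are nonoverlapping assemblies, define
$\alpha \cup \beta$ to be the assembly $\gamma$ defined by $\gamma(p) = \alpha(p)$ for all $p \in \mathrm{dom}\,\alpha$, $\gamma(p) = \beta(p)$ for all $p \in \mathrm{dom}\,\beta$,
and $\gamma(p)$ is undefined for all $p \in \mathbb{Z}^2 \setminus (\mathrm{dom}\,\alpha \cup \mathrm{dom}\,\beta)$. An assembly $\gamma$ is singular if $\gamma(p) = t$ for
some $p \in \mathbb{Z}^2$ and some $t \in T$ and $\gamma(p')$ is undefined for all $p' \in \mathbb{Z}^2 \setminus \{p\}$. Given a hierarchical TAS
$\mathcal{T} = (T, \tau)$, an assembly $\gamma$ is $\mathcal{T}$-producible if either 1) $\gamma$ is singular, or 2) there exist producible
nonoverlapping assemblies $\alpha$ and $\beta$ such that $\gamma = \alpha \cup \beta$ and $\gamma$ is $\tau$-stable. In the latter case, write
$\alpha + \beta \to_1^{\mathcal{T}} \gamma$. An assembly $\alpha$ is $\mathcal{T}$-terminal if for every producible assembly $\beta$ such that $\alpha$ and $\beta$
are nonoverlapping, $\alpha \cup \beta$ is not $\tau$-stable.^ Define $\mathcal{A}[\mathcal{T}]$ to be the set of all $\mathcal{T}$-producible assemblies.
Define $\mathcal{A}_\Box[\mathcal{T}] \subseteq \mathcal{A}[\mathcal{T}]$ to be the set of all $\mathcal{T}$-producible, $\mathcal{T}$-terminal assemblies. A hierarchical TAS
$\mathcal{T}$ is directed (a.k.a., deterministic, confluent) if $|\mathcal{A}_\Box[\mathcal{T}]| = 1$. We say that $\mathcal{T}$ uniquely produces $\alpha$
if $\mathcal{A}_\Box[\mathcal{T}] = \{\alpha\}$.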
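
A single hierarchical attachment step can be sketched as below. The sketch uses the following observation, which one can check from the definition of $\tau$-stability: if $\alpha$ and $\beta$ are each $\tau$-stable and nonoverlapping, then $\alpha \cup \beta$ is $\tau$-stable exactly when the glues across the interface between $\alpha$ and $\beta$ have total strength at least $\tau$. The representation (named tile types, fixed positions with no search over translations) and all function names are assumptions of this example, and the inputs are simply assumed to be producible.

```python
STEP = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}
NULL = ("", 0)                                   # null glue: empty label, strength 0


def make_tile(**sides):
    """Hypothetical helper: unspecified sides carry the null glue."""
    return {d: sides.get(d, NULL) for d in "NSEW"}


def interface_strength(alpha, beta, tiles):
    """Total strength of the glues binding alpha to beta across their interface."""
    total = 0
    for p, name in alpha.items():
        for d, (dx, dy) in STEP.items():
            q = (p[0] + dx, p[1] + dy)
            if q in beta:
                label, strength = tiles[name][d]
                other_label, _ = tiles[beta[q]][OPPOSITE[d]]
                if label == other_label and strength > 0:
                    total += strength
    return total


def attach(alpha, beta, tiles, tau):
    """Return alpha U beta if the attachment is legal, else None.

    Assumes alpha and beta are themselves tau-stable (e.g., producible).
    """
    if alpha.keys() & beta.keys():               # overlapping: union undefined here
        return None
    if interface_strength(alpha, beta, tiles) < tau:
        return None
    return {**alpha, **beta}


tiles = {
    "a1": make_tile(E=("h", 2), N=("u", 1)),
    "a2": make_tile(W=("h", 2), N=("v", 1)),
    "b1": make_tile(E=("k", 2), S=("u", 1)),
    "b2": make_tile(W=("k", 2), S=("v", 1)),
}
alpha = {(0, 0): "a1", (1, 0): "a2"}             # held together by the strength-2 glue h
beta = {(0, 1): "b1", (1, 1): "b2"}              # held together by the strength-2 glue k
gamma = attach(alpha, beta, tiles, tau=2)
```

At temperature 2 the two dominoes attach cooperatively through two strength-1 glues, whereas the single tile `b1` alone binds to `alpha` with strength only 1 and cannot attach.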

Let $\mathcal{T}$ be a hierarchical TAS, and let $\alpha \in \mathcal{A}[\mathcal{T}]$ be a $\mathcal{T}$-producible assembly. An assembly tree
$\Upsilon$ of $\alpha$ is a full binary tree with $|\alpha|$ leaves, whose nodes are labeled by $\mathcal{T}$-producible assemblies,
with $\alpha$ labeling the root, singular assemblies labeling the leaves, and node $u$ labeled with $\gamma$ having
children $u_1$ labeled with $\alpha$ and $u_2$ labeled with $\beta$, with the requirement that $\alpha + \beta \to_1^{\mathcal{T}} \gamma$. That
is, $\Upsilon$ represents one possible pathway through which $\alpha$ could be produced from individual tile
types in $T$. Let $\Upsilon(\mathcal{T})$ denote the set of all assembly trees of $\mathcal{T}$. If $\alpha$ is a descendant node of $\beta$
in an assembly tree of $\mathcal{T}$, write $\alpha \to^{\mathcal{T}} \beta$. Say that an assembly tree is $\mathcal{T}$-terminal if its root is a
$\mathcal{T}$-terminal assembly. Let $\Upsilon_\Box(\mathcal{T})$ denote the set of all $\mathcal{T}$-terminal assembly trees of $\mathcal{T}$. Note that
even a directed hierarchical TAS can have multiple terminal assembly trees that all have the same
root terminal assembly.

When $\mathcal{T}$ is clear from context, we may omit $\mathcal{T}$ from the notation above and instead write $\to_1$,
$\to$, $\partial\alpha$, assembly sequence, produces, producible, and terminal.

^Intuitively $\alpha \to_1 \beta$ means that $\alpha$ can grow into $\beta$ by the addition of a single tile; the fact that we require both $\alpha$
and $\beta$ to be $\tau$-stable implies in particular that the new tile is able to bind to $\alpha$ with strength at least $\tau$. It is easy to
check that had we instead required only $\alpha$ to be $\tau$-stable, and required that the cut of $\beta$ separating $\alpha$ from the new
tile has strength at least $\tau$, then this implies that $\beta$ is also $\tau$-stable.

^The following two convenient characterizations of "directed" are routine to verify. $\mathcal{T}$ is directed if and only if
$|\mathcal{A}_\Box[\mathcal{T}]| = 1$. $\mathcal{T}$ is not directed if and only if there exist $\alpha, \beta \in \mathcal{A}[\mathcal{T}]$ and $p \in \mathrm{dom}\,\alpha \cap \mathrm{dom}\,\beta$ such that $\alpha(p) \neq \beta(p)$.

^The restriction on overlap is a model of a chemical phenomenon known as steric hindrance [28, Section 5.11] or,
particularly when employed as a design tool for intentional prevention of unwanted binding in synthesized molecules,
steric protection [15-17].